NSF PAR Search | NSF Public Access Repository

Regulation of regeneration in Arabidopsis thaliana

https://doi.org/10.1007/s42994-023-00121-9

Islam, Md Khairul; Mummadi, Sai Teja; Liu, Sanzhen; Wei, Hairong (November 2023, aBIOTECH)

Abstract We employed several algorithms with high efficacy to analyze public transcriptomic data, aiming to identify key transcription factors (TFs) that regulate regeneration in Arabidopsis thaliana. Initially, we utilized CollaborativeNet, also known as TF-Cluster, to construct a collaborative network of all TFs, which was subsequently decomposed into many subnetworks using the Triple-Link and Compound Spring Embedder (CoSE) algorithms. Functional analysis of these subnetworks led to the identification of nine subnetworks closely associated with regeneration. We further applied principal component analysis and gene ontology (GO) enrichment analysis to reduce the subnetworks from nine to three, namely subnetworks 1, 12, and 17. Searching for TF-binding sites in the promoters of the co-expressed and co-regulated genes (CCGs) of all TFs in these three subnetworks, together with Triple-Gene Mutual Interaction analysis of the TFs in these three subnetworks with the CCGs involved in regeneration, enabled us to rank the TFs in each subnetwork. Finally, six potential candidate TFs (WOX9A, LEC2, PGA37, WIP5, PEI1, and AIL1 from subnetwork 1) were identified, and their roles in somatic embryogenesis (GO:0010262) and regeneration (GO:0031099) were discussed, as were the TFs in subnetworks 12 and 17 associated with regeneration. The TFs identified were also assessed using the CIS-BP database and Expression Atlas. Our analyses suggest some novel TFs that may have regulatory roles in regeneration and embryogenesis and provide valuable data and insights into the regulatory mechanisms related to regeneration. The tools and the procedures used here are instrumental for analyzing high-throughput transcriptomic data and advancing our understanding of the regulation of various biological processes of interest.
Full Text Available
TGPred: efficient methods for predicting target genes of a transcription factor by integrating statistics, machine learning and optimization

https://doi.org/10.1093/nargab/lqad083

Cao, Xuewei; Zhang, Ling; Islam, Md Khairul; Zhao, Mingxia; He, Cheng; Zhang, Kui; Liu, Sanzhen; Sha, Qiuying; Wei, Hairong (July 2023, NAR Genomics and Bioinformatics)

Abstract Four statistical selection methods for inferring transcription factor (TF)–target gene (TG) pairs were developed by coupling a mean squared error (MSE) or Huber loss function with an elastic net (ENET) or least absolute shrinkage and selection operator (Lasso) penalty. Two methods were also developed for inferring pathway gene regulatory networks (GRNs) by combining a Huber or MSE loss function with a network (Net)-based penalty. To solve these regressions, we improved an accelerated proximal gradient descent (APGD) algorithm to optimize parameter selection processes, resulting in an equally effective but much faster algorithm than the commonly used convex optimization solver. Synthetic data generated in a general setting was used to test the four TF–TG identification methods; ENET-based methods performed better than Lasso-based methods. Synthetic data generated from two network settings was used to test Huber-Net and MSE-Net, which outperformed all other methods. The TF–TG identification methods were also tested with SND1 and gl3 overexpression transcriptomic data; Huber-ENET and MSE-ENET outperformed all other methods when genome-wide predictions were performed. The TF–TG identification methods fill the gap left by the lack of a method for genome-wide TG prediction of a TF and have potential for validating ChIP/DAP-seq results, while the two Net-based methods are instrumental for predicting pathway GRNs.
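The combination of a Huber loss with an elastic net penalty, solved by accelerated proximal gradient descent, can be illustrated with a minimal sketch. This is not the TGPred implementation: the function names, the default values of `lam`, `alpha`, and `delta`, and the FISTA-style momentum schedule are all assumptions chosen for illustration.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t * |v| (the Lasso part of the penalty)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def huber_grad(r, delta):
    """Derivative of the Huber loss with respect to the residual r."""
    return r if abs(r) <= delta else math.copysign(delta, r)

def apgd_huber_enet(X, y, lam=0.1, alpha=0.5, delta=1.0, iters=500):
    """Minimise mean Huber loss + lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    X is a list of rows (samples x candidate TFs), y the target gene
    expression. All parameter names here are hypothetical.
    """
    n, p = len(X), len(X[0])
    # Step size from an upper bound on the gradient's Lipschitz constant.
    step = n / sum(x * x for row in X for x in row)
    beta = [0.0] * p
    z = beta[:]          # extrapolated point used by the accelerated scheme
    t = 1.0
    for _ in range(iters):
        resid = [sum(X[i][j] * z[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * huber_grad(resid[i], delta) for i in range(n)) / n
                for j in range(p)]
        # Proximal step for the elastic net: soft-threshold, then shrink.
        new_beta = [soft_threshold(z[j] - step * grad[j], step * lam * alpha)
                    / (1.0 + step * lam * (1.0 - alpha)) for j in range(p)]
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [new_beta[j] + ((t - 1.0) / t_new) * (new_beta[j] - beta[j])
             for j in range(p)]
        beta, t = new_beta, t_new
    return beta
```

In this sketch, coefficients that remain nonzero would be read as candidate TF–TG links. Replacing `huber_grad` with the plain residual would give the MSE variant of the same scheme.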
Full Text Available
HB-PLS: A statistical method for identifying biological process or pathway regulators by integrating Huber loss and Berhu penalty with partial least squares regression

https://doi.org/10.48130/FR-2021-0006

Deng, Wenping; Zhang, Kui; He, Cheng; Liu, Sanzhen; Wei, Hairong (January 2021, Forestry Research)
Full Text Available
Bacterium-enabled transient gene activation by artificial transcription factors for resolving gene regulation in maize

https://doi.org/10.1093/plcell/koad155

Zhao, Mingxia; Peng, Zhao; Qin, Yang; Tamang, Tej Man; Zhang, Ling; Tian, Bin; Chen, Yueying; Liu, Yan; Zhang, Junli; Lin, Guifang; et al (May 2023, The Plant Cell)

Abstract Understanding gene regulatory networks is essential to elucidate developmental processes and environmental responses. Here, we studied regulation of a maize (Zea mays) transcription factor gene using designer transcription activator-like effectors (dTALes), which are synthetic Type III TALes of the bacterial genus Xanthomonas and serve as inducers of disease susceptibility gene transcription in host cells. The maize pathogen Xanthomonas vasicola pv. vasculorum was used to introduce 2 independent dTALes into maize cells to induce expression of the gene glossy3 (gl3), which encodes a MYB transcription factor involved in biosynthesis of cuticular wax. RNA-seq analysis of leaf samples identified, in addition to gl3, 146 genes altered in expression by the 2 dTALes. Nine of the 10 genes known to be involved in cuticular wax biosynthesis were upregulated by at least 1 of the 2 dTALes. A gene previously unknown to be associated with gl3, Zm00001d017418, which encodes aldehyde dehydrogenase, was also expressed in a dTALe-dependent manner. A chemically induced mutant and a CRISPR-Cas9 mutant of Zm00001d017418 both exhibited glossy leaf phenotypes, indicating that Zm00001d017418 is involved in biosynthesis of cuticular waxes. Bacterial protein delivery of dTALes proved to be a straightforward and practical approach for the analysis and discovery of pathway-specific genes in maize.
Full Text Available
Factorial estimating assembly base errors using k-mer abundance difference (KAD) between short reads and genome assembled sequences

https://doi.org/10.1093/nargab/lqaa075

He, Cheng; Lin, Guifang; Wei, Hairong; Tang, Haibao; White, Frank F; Valent, Barbara; Liu, Sanzhen (September 2020, NAR Genomics and Bioinformatics)
Abstract Genome sequences provide genomic maps with a single-base resolution for exploring genetic contents. Sequencing technologies, particularly long reads, have revolutionized genome assemblies for producing highly continuous genome sequences. However, current long-read sequencing technologies generate inaccurate reads that contain many errors. Some errors are retained in assembled sequences, which are typically not completely corrected by using either long reads or more accurate short reads. The issue commonly exists, but few tools are dedicated to computing error rates or determining error locations. In this study, we developed a novel approach, referred to as k-mer abundance difference (KAD), to compare the inferred copy number of each k-mer indicated by short reads and the observed copy number in the assembly. Simple KAD metrics enable the classification of k-mers into categories that reflect the quality of the assembly. Specifically, the KAD method can be used to identify base errors and estimate the overall error rate. In addition, sequence insertion and deletion as well as sequence redundancy can also be detected. Collectively, KAD is valuable for quality evaluation of genome assemblies and, potentially, provides a diagnostic tool to aid in precise error correction. KAD software has been developed to facilitate public use.
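The comparison between read-inferred and assembly-observed k-mer copy numbers can be sketched as follows. The abstract does not give a formula, so the log-ratio score below, log2((c + m) / (m * (n + 1))), is an assumed form used only for illustration and may differ from what the KAD software computes. Here `c` is the read count of a k-mer, `n` its count in the assembly, and `m` the read depth expected for a single-copy k-mer; all function names are hypothetical, and reverse complements are ignored for brevity.

```python
import math
from collections import Counter

def count_kmers(seqs, k):
    """Count every k-mer across a collection of sequences (forward strand only)."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def kad_score(c, n, m):
    """Assumed log-ratio score: 0 when the read-inferred copy number c / m
    equals the assembly copy number n; negative when the assembly holds
    more copies than the reads support; positive when it holds fewer."""
    return math.log2((c + m) / (m * (n + 1)))

def kad_profile(reads, assembly, k, m):
    """Score every k-mer seen in either the reads or the assembly."""
    read_counts = count_kmers(reads, k)
    asm_counts = count_kmers(assembly, k)
    kmers = set(read_counts) | set(asm_counts)
    return {km: kad_score(read_counts[km], asm_counts[km], m) for km in kmers}
```

Under this assumed form, a k-mer present once in the assembly but absent from the reads scores -1, which is the pattern a base error in the assembly would produce, while a k-mer supported by reads at single-copy depth but missing from the assembly scores +1.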
Full Text Available
Chromosome-level genome assembly of a regenerable maize inbred line A188

https://doi.org/10.1186/s13059-021-02396-x

Lin, Guifang; He, Cheng; Zheng, Jun; Koo, Dal-Hoe; Le, Ha; Zheng, Huakun; Tamang, Tej Man; Lin, Jinguang; Liu, Yan; Zhao, Mingxia; et al (December 2021, Genome Biology)
Abstract Background: The maize inbred line A188 is an attractive model for elucidation of gene function and improvement due to its high embryogenic capacity and many contrasting traits to the first maize reference genome, B73, and other elite lines. The lack of a genome assembly of A188 limits its use as a model for functional studies. Results: Here, we present a chromosome-level genome assembly of A188 using long reads and optical maps. Comparison of A188 with B73 using both whole-genome alignments and read depths from sequencing reads identifies approximately 1.1 Gb of syntenic sequences as well as extensive structural variation, including a 1.8-Mb duplication containing the Gametophyte factor1 locus for unilateral cross-incompatibility, and six inversions of 0.7 Mb or greater. Increased copy number of carotenoid cleavage dioxygenase 1 (ccd1) in A188 is associated with elevated expression during seed development. High ccd1 expression in seeds together with low expression of yellow endosperm 1 (y1) reduces carotenoid accumulation, accounting for the white seed phenotype of A188. Furthermore, transcriptome and epigenome analyses reveal enhanced expression of defense pathways and altered DNA methylation patterns of the embryonic callus. Conclusions: The A188 genome assembly provides a high-resolution sequence for a complex genome species and a foundational resource for analyses of genome variation and gene function in maize. The genome, in comparison to B73, contains extensive intra-species structural variations and other genetic differences. Expression and network analyses identify discrete profiles for embryonic callus and other tissues.
Full Text Available